Return-Path: <icon-group-sender>
Received: from kingfisher.CS.Arizona.EDU (kingfisher.CS.Arizona.EDU [192.12.69.239])
by baskerville.CS.Arizona.EDU (8.8.8/8.8.7) with SMTP id MAA12478
for <icon-group-addresses@baskerville.CS.Arizona.EDU>; Thu, 14 May 1998 12:24:34 -0700 (MST)
Received: by kingfisher.CS.Arizona.EDU (5.65v4.0/1.1.8.2/08Nov94-0446PM)
id AA22110; Thu, 14 May 1998 12:24:30 -0700
Message-Id: <199805141509.RAA25422@capway.com>
From: "Vladimir Grodzenski" <grodzens@capway.com>
To: icon-group@optima.CS.Arizona.EDU
Date: Thu, 14 May 1998 17:05:33 +0000
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
Subject: Re: AI use for Icon
Reply-To: <vladimir.grodzenski@capway.com>
Priority: urgent
In-Reply-To: <199805061704.MAA14525@axp.cmpu.net>
Errors-To: icon-group-errors@optima.CS.Arizona.EDU
Status: RO
Content-Length: 1140
On 6 May 98 at 12:04, Gordon Peterson wrote:
> For the "fuzzy match" I think that one interesting way to at least
> help winnow down the possibilities would be to examine the
> intersection of the character sets of the different names. Those
> which have a high intersection (all but a "few" characters) can be
> then examined more closely.
>
> For a better "fuzzy compare" function I've liked the use of
> overlapping character pairs (including a blank added to the start
> and end of each name).
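
As a rough sketch of the two quoted ideas in Icon: the procedure names common and pairs are made up here, and this is only one possible reading of the suggestion, not code from the quoted message.

```icon
# Hypothetical helpers illustrating the two quoted suggestions.

# Number of distinct characters the two names have in common
# (size of the intersection of their character sets).
procedure common(a, b)
   return *(cset(a) ** cset(b))
end

# Set of overlapping character pairs in name, with a blank
# added to the start and end first.
procedure pairs(name)
   local S, i
   S := set([])
   name := " " || name || " "
   every i := 1 to *name - 1 do
      insert(S, name[i+:2])
   return S
end

procedure main()
   # distinct characters shared by the two names
   write(common("Peterson", "Petersen"))
   # character pairs shared by the two names
   write(*(pairs("Peterson") ** pairs("Petersen")))
end
```

For these two names the program writes 6 and then 7: they share six distinct characters, and seven of the nine padded pairs of each name appear in both.
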
Another approach:

For each employee 'name', create a table whose keys are the
characters of 'name' and whose values are the number of times each
character occurs in 'name'. For example:

   T := table(0)
   every T[!name] +:= 1

We can then define a rough "difference" between two such tables
(T1, T2) as the sum, over all characters, of the absolute difference
in their counts:

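procedure tdiff(T1, T2)
   local T, k, weight
   # T1 and T2 must be built with table(0) so missing keys read as 0
   weight := 0
   T := table(0)
   # for every character in either table, record the gap in its counts
   every k := key(T1) | key(T2) do
      T[k] := abs(T1[k] - T2[k])
   # the weight is the sum of all per-character gaps
   every weight +:= !T
   return weight
end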
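
To try this on a few names, something like the following should work. The helper charcount and the sample names are only for illustration; tdiff is repeated, with k declared local, so that the program stands alone.

```icon
# Hypothetical helper: build the character-count table for one name.
procedure charcount(name)
   local T
   T := table(0)            # default 0, so unseen characters count as 0
   every T[!name] +:= 1
   return T
end

# Sum of absolute differences between the character counts.
procedure tdiff(T1, T2)
   local T, k, weight
   weight := 0
   T := table(0)
   every k := key(T1) | key(T2) do
      T[k] := abs(T1[k] - T2[k])
   every weight +:= !T
   return weight
end

procedure main()
   write(tdiff(charcount("Peterson"), charcount("Petersen")))
   write(tdiff(charcount("Peterson"), charcount("Grodzenski")))
   write(tdiff(charcount("Brian"), charcount("Brain")))
end
```

This writes 2, 8 and 0. The last result shows a limitation: the measure ignores character order, so two names that are anagrams of each other get a difference of 0 and would still need a closer comparison.
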
Vladimir Grodzenski
=================================================
E-mail: vladimir.grodzenski@capway.com
CompuServe: 100700,526
=================================================